How Efficient Is MPEG-7 for General Sound Recognition?

Authors

  • HYOUNG-GOOK KIM
  • JUAN JOSÉ BURRED
  • THOMAS SIKORA
Abstract

Our challenge is to analyze and classify video soundtrack content for indexing purposes. To this end, we compare the performance of MPEG-7 Audio Spectrum Projection (ASP) features, based on several basis decomposition algorithms, against Mel-scale Frequency Cepstrum Coefficients (MFCC). For the basis decomposition step of feature extraction we evaluate three approaches: Principal Component Analysis (PCA), Independent Component Analysis (ICA), and Non-negative Matrix Factorization (NMF). Audio features are computed from the resulting reduced vectors and fed into a continuous hidden Markov model (CHMM) classifier. Our conclusion is that the established MFCC features yield better performance than MPEG-7 ASP for general sound recognition under practical constraints.
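As a rough illustration of the basis-projection idea described in the abstract, the sketch below computes ASP-like features using PCA only. It is a simplified sketch under stated assumptions, not the normative MPEG-7 extraction and not the authors' implementation: the function name `asp_like_features`, the random test spectrogram, the linear frequency bins, and the choice of three basis vectors are all hypothetical. ICA or NMF would replace the SVD step.

```python
import numpy as np


def asp_like_features(spectrogram, n_basis=3, eps=1e-10):
    """Project a log-scaled, frame-normalized spectrogram onto a PCA basis.

    spectrogram: array of shape (n_frames, n_bins) with non-negative power values.
    Returns (projection, basis) with shapes (n_frames, n_basis) and (n_bins, n_basis).
    Assumption: a simplified stand-in for ASP extraction, for illustration only.
    """
    # Convert power values to a decibel scale (eps avoids log of zero).
    log_spec = 10.0 * np.log10(spectrogram + eps)

    # Normalize each frame by its L2 norm so the basis captures spectral shape.
    norms = np.linalg.norm(log_spec, axis=1, keepdims=True)
    normalized = log_spec / np.maximum(norms, eps)

    # PCA via SVD of the mean-centered frames; rows of vt are principal axes.
    centered = normalized - normalized.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_basis].T

    # Reduced-dimension features: one n_basis-dimensional vector per frame.
    projection = normalized @ basis
    return projection, basis


# Hypothetical input: 50 frames of a 16-bin power spectrogram.
rng = np.random.default_rng(0)
spec = rng.random((50, 16)) + 0.1
features, basis = asp_like_features(spec, n_basis=3)
```

In a pipeline like the one the abstract describes, per-frame vectors of this kind would then be passed to the classifier, for example one CHMM per sound class.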

Similar resources

MPEG-7 sound-recognition tools

The MPEG-7 sound-recognition Descriptors and Description Schemes consist of tools for indexing audio media using probabilistic sound models. The Descriptors provide containers for category labels, as well as data structures for quantitative information about sound content. We describe the normative tools, as well as informative methods, for automatic description extraction and sound matching.

KDD-Based Approach to Musical Instrument Sound Recognition

Automatic content extraction from multimedia files is a hot topic nowadays. Moving Picture Experts Group develops MPEG-7 standard, which aims to define a unified interface for multimedia content description, including audio data. Audio description in MPEG-7 comprises features that can be useful for any content-based search of sound files. In this paper, we investigate how to optimize sound repr...

Audio Environment Recognition using Zero Crossing Features and MPEG-7 Descriptors

Problem statement: This study investigated zero crossing features and selected MPEG-7 audio descriptors for environment sound recognition applications such as audio forensics. Approach: The study implemented several experiments focusing on the problems of environment recognition from audio particularly for forensic applications. Results: It was investigated the effect of the temporal zero cross...

Efficient Representation of Sound Images: Recent Developments in Parametric Coding of Spatial Audio

Like with pictures, humans talk about a "sound image" when they try to characterize an acoustic scene containing salient spatial aspects. This talk will review the basic aspects of stereophonic / multi-channel audio that determine the perceived sound image and will outline how these aspects can be represented efficiently. One of the most remarkable innovations in this context was the recent dev...

A multidomain approach for automatic home environmental sound classification

This article presents a multidomain approach which addresses the problem of automatic home environmental sound recognition. The proposed system will be part of a human activity monitoring system which will be based on heterogeneous sensors. This work concerns the audio classification component and its primary role is to detect anomalous sound events. We compare the discriminative capabilities o...

Publication date: 2004